OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique

Internal identifier: 000778 (Main/Exploration); previous: 000777; next: 000779

Authors: Apurva A. Desai [India]

Source:

RBID : Pascal:12-0306480

French descriptors

English descriptors

Abstract

A lot of work has been done on Optical Character Recognition (OCR) for various Indian languages. But for Gujarati, a language belonging to the Devnagari family of languages and spoken in the western state of Gujarat in India, hardly any work can be traced, especially for handwritten characters. In this work I have addressed the problem of recognizing handwritten Gujarati numerals. Pre-processing techniques such as global thresholding, erosion and dilation, and skew correction are used. A novel hybrid feature extraction technique is suggested and used, combining a structural approach and a statistical approach to feature extraction. The image is subdivided and the pixel information is then used as the structural approach, whereas the aspect ratio of the numeral is considered as the statistical approach. For classification, a kNN classifier has been used. This model gives an overall accuracy of 96.99% for handwritten Gujarati numerals.
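
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not give the number of sub-images, the value of k, the distance metric, or the threshold, so the 2x2 zoning, Euclidean distance, threshold of 128, dark-ink-on-light-paper convention, and all function names below are assumptions made for illustration. Erosion, dilation, and skew correction are omitted.

```python
from collections import Counter
from math import dist


def binarize(gray, threshold=128):
    """Global threshold: pixels darker than `threshold` become foreground (1).
    The threshold value and the dark-ink convention are assumptions."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def bounding_box(img):
    """Inclusive (top, bottom, left, right) bounds of the foreground pixels."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]


def hybrid_features(img, zones=2):
    """Structural part: foreground density in each of zones x zones sub-images.
    Statistical part: width / height of the numeral's bounding box.
    The zone count is hypothetical; the abstract does not specify it."""
    top, bottom, left, right = bounding_box(img)
    height, width = bottom - top + 1, right - left + 1
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            r0 = top + zr * height // zones
            r1 = top + (zr + 1) * height // zones
            c0 = left + zc * width // zones
            c1 = left + (zc + 1) * width // zones
            cells = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            # A zone can be empty when the glyph is narrower than `zones` pixels.
            feats.append(sum(cells) / len(cells) if cells else 0.0)
    feats.append(width / height)  # aspect ratio
    return feats


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors nearest to `query`.
    `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy shapes stand in for numeral images; real labels would be the digits 0-9.
BAR = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
BLOCK = [[1, 1], [1, 1]]
train = [(hybrid_features(BAR), "bar"), (hybrid_features(BLOCK), "block")]
tall_bar = [[1], [1], [1], [1], [1], [1]]
print(knn_predict(train, hybrid_features(tall_bar), k=1))
```

Appending the aspect ratio to the zone densities gives a single feature vector, so a standard distance-based classifier such as kNN can be applied without modification. In practice the two kinds of feature might need rescaling so that neither dominates the distance.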


Affiliations:



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique</title>
<author>
<name sortKey="Desai, Apurva A" sort="Desai, Apurva A" uniqKey="Desai A" first="Apurva A." last="Desai">Apurva A. Desai</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Computer Science, Veer Narmad South Gujarat University, Udhna Magdalla Road</s1>
<s2>Surat - 395007, Gujarat</s2>
<s3>IND</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Inde</country>
<wicri:noRegion>Surat - 395007, Gujarat</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">12-0306480</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 12-0306480 INIST</idno>
<idno type="RBID">Pascal:12-0306480</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000089</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000683</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000152</idno>
<idno type="wicri:Area/Main/Merge">000783</idno>
<idno type="wicri:Area/Main/Curation">000778</idno>
<idno type="wicri:Area/Main/Exploration">000778</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique</title>
<author>
<name sortKey="Desai, Apurva A" sort="Desai, Apurva A" uniqKey="Desai A" first="Apurva A." last="Desai">Apurva A. Desai</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Computer Science, Veer Narmad South Gujarat University, Udhna Magdalla Road</s1>
<s2>Surat - 395007, Gujarat</s2>
<s3>IND</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Inde</country>
<wicri:noRegion>Surat - 395007, Gujarat</wicri:noRegion>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Aspect ratio</term>
<term>Character recognition</term>
<term>Classification</term>
<term>Feature extraction</term>
<term>Language family</term>
<term>Manuscript character</term>
<term>Mathematical morphology</term>
<term>Modeling</term>
<term>Natural language</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Pattern recognition</term>
<term>Statistical analysis</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Caractère manuscrit</term>
<term>Reconnaissance optique caractère</term>
<term>Famille langage</term>
<term>Langage naturel</term>
<term>Morphologie mathématique</term>
<term>Classification</term>
<term>Extraction caractéristique</term>
<term>Rapport aspect</term>
<term>Extraction forme</term>
<term>Analyse statistique</term>
<term>Modélisation</term>
<term>.</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Lot of work has been done for Optical Character Recognition (OCR) for various Indian languages. But for Gujarati, a language belonging to Devnagari family of languages and spoken in the western state of Gujarat in India, hardly any work can be traced especially for handwritten characters. In this work I have addressed the problem of handwritten Gujarati numerals. Here pre-process techniques like global threshold, erosion and dilation, skew correction etc, are used. A novel hybrid feature extraction technique is suggested and used which is constituted by a structural approach and statistical approach of feature extraction. Image is subdivided and then the pixel information is used as a structural approach whereas the aspect ratio of the number is considered as a statistical approach. For classification kNN classifier has been used. This model gives overall accuracy of 96.99% for the handwritten Gujarati numerals.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Inde</li>
</country>
</list>
<tree>
<country name="Inde">
<noRegion>
<name sortKey="Desai, Apurva A" sort="Desai, Apurva A" uniqKey="Desai A" first="Apurva A." last="Desai">Apurva A. Desai</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000778 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000778 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:12-0306480
   |texte=   Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024